Lecture 12

Dealing with many zeros

Sara Martino

Dept. of Mathematical Science, NTNU

Janine Illian

University of Glasgow

Jafet Belmont

University of Glasgow

Motivation

  • Count data are often modeled with Poisson or Negative Binomial models.

  • These assume zeros occur naturally from the count process.

    • Example: for Poisson is \(P(Y=0) = \exp(\lambda)\)
  • But in real data, we often see too many zeroszero inflation.

  • Examples:

    • Many people have 0 doctor visits per year.
    • Most customers have 0 claims.
    • Many species are absent from samples.

Two main solutions

Both handle excess zeros but use different assumptions about how zeros are generated.

  • Zero-Inflated Models

  • Hurdle Models

Two main solutions

Both handle excess zeros but use different assumptions about how zeros are generated.

  • Zero-Inflated Models

    • Two sources of zero: inflation process or count process
  • Hurdle Models

Two main solutions

Both handle excess zeros but use different assumptions about how zeros are generated:

  • Zero-Inflated Models

    • Two sources of zero: inflation process or count process
  • Hurdle Models

    • Onle one source of zeros

Zero-Inflated Models

Assume two processes:

  • A binary process → decides if the observation is a certain zero.

  • A count process → generates counts, including zeros.

\[ \begin{aligned} \text{P}(Y=0) & = \pi + (1-\pi)f(0) & \\ \text{P}(Y=y) & = (1-\pi)f(y)& y=1,2,\dots \end{aligned} \]

Note In zero-inflated models \(f(y)\) is typically discrete (ex Poisson, Binomial, …)

Hurdle Models

Also have two parts, but different logic:

  • A binary model determines if the observation crosses the hurdle (i.e. zero vs. positive).

  • A truncated count model models only positive counts.

Zeros come only from the first process.

\[ \begin{aligned} \text{P}(Y=0) & = \pi & \\ \text{P}(Y=y) & = \frac{f(y)}{1-f(0)}& y=1,2,\dots \end{aligned} \]

Note In hurdle models \(f_Y(y)\) can also be continuous (f.ex Gamma). In these cases: \[ \begin{aligned} \text{P}(Y=0) & = \pi & \\ f_Y(y) & = f(y)& y>0 \end{aligned} \] where \(f(y)\) is scaled so that it integrates to \(1-\pi\)

Comparison: Zero-Inflated vs. Hurdle Models

Feature Zero-Inflated Hurdle
Zeros come from Two sources (inflation + count) One source (hurdle only)
Positive part Includes zeros Truncated at zero
Can use continuous distributions? Typically count only Yes (Gamma, Lognormal)

Zero Inflated models in inlabru (Type 1)

  • Four zero-inflated models are implemented
    • Poisson (zeroinflatedpoisson1)
    • Binomial (zeroinflatedbinomial1)
    • Negative Binomial (zeroinflatednbinomial1)
    • BetaBinomial (zeroinflatedbinomial1)

To get details about the distributions you can type

INLA::inla.doc("zero inflated")

The probability of the inflation process \(\pi\) is a hyperparameter therefore cannot be modeled with covariates

Example: number of fishes caught by fishermen at a state park.

  nofish persons child count
1      1       1     0     0
2      0       1     0     0
3      0       1     0     0

Example: two models

\[ \begin{eqnarray} \text{Model 1: }\ & Y\sim \text{Poisson}(\lambda)\\ \text{Model 2: }\ & Y\sim \text{NegBinomial}(n,p)\\ \text{Linear predictor: }\ & \eta = \beta_0 + \beta_1x_1 + \beta_2x_2 \end{eqnarray} \]

Example: two models

\[ \begin{eqnarray} \text{Model 1: }\ & Y\sim \text{Poisson}(\lambda)\\ \text{Model 2: }\ & Y\sim \text{NegBinomial}(n,p)\\ \text{Linear predictor: }\ & \eta = \beta_0 + \beta_1x_1 + \beta_2x_2 \end{eqnarray} \]

Negative Binomial distribution

\[ \text{Prob}(Y=y) = \frac{\Gamma(y+n)}{\Gamma(n)\Gamma(y+1)}p^n(1-p)^y \] with \[ E(Y) = \mu = n\frac{1-p}{p} \qquad \text{Var}(Y) = \mu(1+\frac{\mu}{n}) \] and linear predictor \(\eta = log(\mu)\)

Example: Implementation

The Model \[ \begin{aligned} \text{Model 1 }: y_i|\eta_i & \sim \pi1_{(y=0)} + (1-\pi)\text{Pois}(\lambda(\eta_i))\\ \text{Model 2 }: y_i|\eta_i & \sim \pi1_{(y=0)} + (1-\pi)\text{NegBin}(n(\eta_i),p(\eta_i))\\ \text{Linear Predictor }: \eta_i & = \color{red}{\boxed{\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}}} \end{aligned} \]

zinb[1:3,]
  nofish persons child count
1      1       1     0     0
2      0       1     0     0
3      0       1     0     0

The code

# define model components
cmp = ~ Intercept(1) + fishing(nofish, model = "linear") + 
  persons(persons, model = "linear") + 
  child(child, model = "linear")

# define model predictor
formula = count ~ .

# build the observation model
# Poisson model
lik_pois = bru_obs(formula,
                   family = "zeroinflatedpoisson1",
                   data = zinb) 

# Negative binomial model
lik_nbin = bru_obs(formula,
                   family = "zeroinflatednbinomial1",
                   data = zinb) 

# fit the model
fit_pois = bru(cmp, lik_pois)
fit_nbin = bru(cmp, lik_nbin)

Example: Implementation

The Model \[ \begin{aligned} \text{Model 1 }: \color{red}{\boxed{y_i|\eta_i}} & \color{red}{\boxed{\sim \pi1_{(y=0)} + (1-\pi)\text{Pois}(\lambda(\eta_i))}}\\ \text{Model 2 }: y_i|\eta_i & \sim \pi1_{(y=0)} + (1-\pi)\text{NegBin}(n(\eta_i),p(\eta_i))\\ \text{Linear Predictor }: \eta_i & = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} \end{aligned} \]

zinb[1:3,]
  nofish persons child count
1      1       1     0     0
2      0       1     0     0
3      0       1     0     0

The code

# define model components
cmp = ~ Intercept(1) + fishing(nofish, model = "linear") + 
  persons(persons, model = "linear") + 
  child(child, model = "linear")

# define model predictor
formula = count ~ .

# build the observation model
# Poisson model
lik_pois = bru_obs(formula,
                   family = "zeroinflatedpoisson1",
                   data = zinb) 

# Negative binomial model
lik_nbin = bru_obs(formula,
                   family = "zeroinflatednbinomial1",
                   data = zinb) 

# fit the model
fit_pois = bru(cmp, lik_pois)
fit_nbin = bru(cmp, lik_nbin)

Example: Implementation

The Model \[ \begin{aligned} \text{Model 1 }: y_i|\eta_i & \sim \pi1_{(y=0)} + (1-\pi)\text{Pois}(\lambda(\eta_i))\\ \text{Model 2 }: \color{red}{\boxed{y_i|\eta_i}} & \color{red}{\boxed{\sim \pi1_{(y=0)} + (1-\pi)\text{NegBin}(n(\eta_i),p(\eta_i))}}\\ \text{Linear Predictor }: \eta_i & = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} \end{aligned} \]

zinb[1:3,]
  nofish persons child count
1      1       1     0     0
2      0       1     0     0
3      0       1     0     0

The code

bru_options_set(control.compute = list(dic = T, waic = T, mlik = T))
# define model components
cmp = ~ Intercept(1) + fishing(nofish, model = "linear") + 
  persons(persons, model = "linear") + 
  child(child, model = "linear")

# define model predictor
formula = count ~ .

# build the observation model
# Poisson model
lik_pois = bru_obs(formula,
                   family = "zeroinflatedpoisson1",
                   data = zinb) 

# Negative binomial model
lik_nbin = bru_obs(formula,
                   family = "zeroinflatednbinomial1",
                   data = zinb) 

# fit the model
fit_pois = bru(cmp, lik_pois)
fit_nbin = bru(cmp, lik_nbin)

Results - Fixed effects

[1] "Poisson Model"
           mean   sd 0.025quant 0.975quant
Intercept  0.22 0.15      -0.08       0.51
fishing   -0.90 0.12      -1.13      -0.67
persons    0.65 0.05       0.56       0.74
child     -0.95 0.10      -1.15      -0.76
[1] "Negative Binomial Model"
           mean   sd 0.025quant 0.975quant
Intercept -0.80 0.31      -1.42      -0.18
fishing   -0.59 0.25      -1.08      -0.10
persons    0.94 0.12       0.71       1.17
child     -1.65 0.19      -2.03      -1.27

Results - Hyperparameters and scores

Hyperparameters

[1] "Poisson Model"
                                                       mean   sd
zero-probability parameter for zero-inflated poisson_1 0.48 0.04
[1] "Negative Binomial Model"
                                                         mean   sd
size for nbinomial_1 zero-inflated observations          0.56 0.10
zero-probability parameter for zero-inflated nbinomial_1 0.06 0.06

Scores

  Score Poisson    Nbin
1   DIC 1111.36  795.44
2  Mlik -579.97 -424.72
3  WAIC 1207.12  798.86

Comparing estimated distribution for a specific data point

\[ \begin{eqnarray} \text{Model 1: }\ & P(Y=0) &= \pi + (1-\pi)\ \text{Poisson}(\lambda)\\ \text{Model 2: }\ & P(Y=0)&= \pi + (1-\pi)\ \text{NegBinom}(n,p)\\ \end{eqnarray} \] We check the distribution for one person, non-fisher and no children

Comparing estimated distribution - Implementation

df_pred = data.frame(nofish = 0, persons = 1, child = 0)

prob_pois = predict(fit_pois, df_pred,
        formula = ~ {
          pi0 <- zero_probability_parameter_for_zero_inflated_poisson_1
          lambda = exp(Intercept +  fishing + persons + child)
          yy = 0:20
          list(
            prob0 =  pi0 * c(1,rep(0,20)) + (1-pi0) * dpois(yy, lambda = lambda))
          } )

prob_nbin = predict(fit_nbin, df_pred,
        formula = ~ {
          pi0 <- zero_probability_parameter_for_zero_inflated_nbinomial_1
          size = exp(size_for_nbinomial_1_zero_inflated_observations)
          mu = exp(Intercept +  fishing + persons + child)
          p = size/(size + mu)
          yy = 0:20
          list(prob0 = pi0 * c(1,rep(0,20)) + (1-pi0) * dnbinom(yy,size = size, mu = mu))
        })

Comparing estimated distribution - Implementation

Note: To get the names to input in predict use the bru_standardise_names() function:

inlabru::bru_standardise_names(fit_pois)

Hurdle Models in inlabru

There are two ways to implement hurdle models in inlabru

  • Strategy 1 Use a modified version of the same distributions defined for the zero-inflated models

  • Strategy 2 Use a “two-likelihood” trick

Strategy 1 - Implemented models

Available models (Type 0)

  • Poisson (zeroinflatedpoisson0)

  • Binomial (zeroinflatedbinomial0)

  • Negative Binomial (zeroinflatednbinomial0)

  • BetaBinomial (zeroinflatedbinomial0)

Advantages

  • It works exactly as the zero-inflated models
  • No need for extra coding

Disadvantages

  • Can only be used for the models that are already implemented
  • The probability of zero (\(\pi\)) is fixed and cannot have covariates

Example Strategy 1: Implementation

The Model \[ \begin{aligned} \text{Model 1 }: y_i|\eta_i & \sim \pi1_{(y=0)} + (1-\pi)\text{TruncPois}(\lambda(\eta_i))\\ \text{Model 2 }: y_i|\eta_i & \sim \pi1_{(y=0)} + (1-\pi)\text{TruncNegBin}(n(\eta_i),p(\eta_i))\\ \text{Linear Predictor }: \eta_i & = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} \end{aligned} \]

zinb[1:3,]
  nofish persons child count
1      1       1     0     0
2      0       1     0     0
3      0       1     0     0

The code

# define model components
cmp = ~ Intercept(1) + fishing(nofish, model = "linear") + 
  persons(persons, model = "linear") + 
  child(child, model = "linear")

# define model predictor
formula = count ~ .

# build the observation model
# Poisson model
lik_pois = bru_obs(formula,
                   family = "zeroinflatedpoisson0",
                   data = zinb) 

# Negative binomial model
lik_nbin = bru_obs(formula,
                   family = "zeroinflatednbinomial0",
                   data = zinb) 

# fit the model
fit_pois0 = bru(cmp, lik_pois)
fit_nbin0 = bru(cmp, lik_nbin)

Example Strategy 1: Results

Strategy 2 - Two-likelihood trick

Advantages

  • More flexible model
  • Can model also the probability of zero \(\pi\)
  • Can also be used for continuous distributions with mass at 0

Disadvantage

  • Some more coding

Hurdle models com

The Hurdle Model handles this by splitting the data-generating process into two parts:

  1. Zero-hurdle component: This part is a binary model that estimates the probability of a zero count versus a non-zero count.
  2. Positive component: This part models the distribution of the positive (non-zero) values.

This idea can be used in inlabru by using two likelihood

  • A binomial likelihood to model the 0/1 process
  • A continuous positive density or a truncated count distribution for the positive part

Example - Gamma-Hurdle model

Daily precipitation in Parana state

\[ z_{st} = \left\{ \begin{eqnarray} 0 & \text{ if } &\text{ no rain on day } t \text{ at station } s\\ 1 & \text{ if } &\text{ rain on day } t \text{ at station } s \end{eqnarray} \right. \]

\[ y_{st} = \left\{ \begin{eqnarray} \text{NA} \text{ if } && \text{ no rain on day } t \text{ at station } s\\ \text{rain amout} \text{ if } && \text{ rain on day } t \text{ at station } s \end{eqnarray} \right. \]

\[ z_{st}\sim \text{Binom}(p_{st}, n_{st} = 1); \qquad y_{st}|z_{st}=1\sim\text{Gamma}(a,b) \ \text{with}\ E(y_{st}) = \mu_{st} = a/b \] Where \[ \begin{eqnarray} \eta_t^1 & = \text{logit}(p_t) & = \beta_0^B+u^B(s) \\ \eta_t^2 & = \log(\mu_t)& = \beta_0^G+u^G(s) \end{eqnarray} \]

Example - Gamma-Hurdle model

Implementation

  1. Create response for binary and Gamma process
df = df %>%
  mutate(z = ifelse(value==0,0,1),
         y = ifelse(value==0,NA, value),)
  1. Define components, observation models and run
cmp = ~ -1 + Intercept_z(1) + Intercept_y(1) +
  space_z(geometry, model = spde) +
  space_y(geometry, model = spde) +
  local_z(station, model = "iid")

lik_z = bru_obs(formula = z ~ Intercept_z + space_z + local_z,
                data = df %>% filter(time==1),
                family = "binomial")

lik_y = bru_obs(formula = y ~ Intercept_y + space_y,
                data =  df %>% filter(time==1),
                family = "gamma")

fit = bru(cmp, lik_z, lik_y)

Example - Gamma-Hurdle model

Results

Example - Gamma-Hurdle model

We can:

  • use other components in both linear predictors \(\eta^1\) and \(\eta^2\)
  • have shared components between the two linear predictor
  • use other positive-continuous distribution instead of the Gamma (for example log-Normal, Weibull, etc.)

Example - Poisson- Hurdle model

Let’s look again at the fishes examples. We now want to include covariates also in the model for the zero probability:

\[ \begin{eqnarray} P(Y = 0) = &\pi\\ P(Y=y) = & (1-\pi) \frac{f_Y(y)}{f_Y(0)},& \qquad y =1,2,\dots \end{eqnarray} \] We use the same “trick” as before and we now have \[ \begin{eqnarray} z\sim\text{Binomial}(\pi,n); &\qquad \eta_1 = \text{logit}(\pi) = \beta^1X\\ z\sim\text{Truncated Poisson}(\lambda); &\qquad \eta_2 =\log(\lambda) = \beta^2X\\ \end{eqnarray} \] In inlabru the truncated Poisson distribution is called nzpoisson

Example - Poisson- Hurdle model

Implementation

  1. Define response for binary and truncated poisson process
zinb  = zinb %>%
  mutate(z = ifelse(count==0,1,0),
         y = ifelse(count==0,NA, count))
  1. Define components, observation models and run
cmp = ~ -1 + Intercept_z(1) + Intercept_y(1) +
  nofish_z(nofish, model = "linear") + persons_z(persons, model = "linear") +
    nofish_y(nofish, model = "linear") + persons_y(persons, model = "linear")

lik_z = bru_obs(formula = z~ Intercept_z  + nofish_z + persons_z,
                data = zinb,
                family = "binomial")

lik_y = bru_obs(formula = y~ Intercept_y  + nofish_y + persons_y,
                data = zinb,
                family = "nzpoisson")

fit = bru(cmp, lik_z, lik_y)

Example - Poisson- Hurdle model

Results

round(fit$summary.fixed,2)
             mean   sd 0.025quant 0.5quant 0.975quant  mode kld
Intercept_z  0.70 0.33       0.04     0.70       1.35  0.70   0
nofish_z     0.23 0.28      -0.33     0.23       0.78  0.23   0
persons_z   -0.18 0.12      -0.41    -0.18       0.04 -0.18   0
Intercept_y  0.35 0.15       0.05     0.35       0.65  0.35   0
nofish_y    -0.83 0.12      -1.07    -0.83      -0.59 -0.83   0
persons_y    0.53 0.04       0.44     0.53       0.61  0.53   0

Example: Binomial-Hurdle model

  • In inlabru there is no implementation for the truncated binomial (or negative binomial) distribution

  • We can “trick” inlabru by using one of the implemented Type 0 models \[ y_i|\eta_i \sim \pi1_{(y=0)} + (1-\pi)\frac{f(y)}{f(0)} \] with a fixed and small \(\pi\) thus creating a truncated distribution!

Example: Binomial-Hurdle model

Implementation

  1. Define response and the components
# Define response for binary and truncated neg. binomial
zinb  = zinb %>% mutate(z = ifelse(count==0,1,0),
                        y = ifelse(count==0,NA, count))
# define model components
cmp = ~ -1 + Intercept_z(1) + Intercept_y(1) + 
  fishing_z(nofish, model = "linear") + fishing_y(nofish, model = "linear") + 
  persons_z(persons, model = "linear") + persons_y(persons, model = "linear") + 
  child_z(child, model = "linear") + child_y(child, model = "linear")
  1. Build the likelihood for the binomial part
# build the observation model
# Poisson model
lik_binom = bru_obs(z ~ Intercept_z + fishing_z + persons_z + child_z ,
                   family = "binomial",
                   data = zinb) 
  1. Define the type 0 Negbinomial with \(\pi\) fixed to a small value
# Negative binomial model
lik_trunc_nbin = bru_obs(y ~ Intercept_y + fishing_y + persons_y + child_y ,
                   family = "zeroinflatednbinomial0",
                   data = zinb,
                   control.family = list(hyper = list(theta = list(initial = -20, 
                                                                   fixed = TRUE))))
  1. Run the model
fit_nbin = bru(cmp, lik_binom, lik_trunc_nbin)

Summary - Treating many zero with inlabru

There are mainly two types of models to treat excess of zeros

  • Zero inflated models
  • Hurdle models

Summary - Treating many zero with inlabru

There are mainly two types of models to treat excess of zeros

  • Zero inflated models
    • Zeros come from two sources
    • Can only be used for counts (Poisson, Binomial,…)
    • Four likelihood are implemented in inlabru (Type1)
    • The probability of zero-inflation is constant
  • Hurdle models

Summary - Treating many zero with inlabru

There are mainly two types of models to treat excess of zeros

  • Zero inflated models
  • Hurdle models
    • Only one source of zeros
    • Positive values can be both discrete and continuous
    • Two ways to imlement in inlabru

Summary - Treating many zero with inlabru

There are mainly two types of models to treat excess of zeros

  • Zero inflated models
  • Hurdle models
    • Only one source of zeros

    • Positive values can be both discrete and continuous

    • Two ways to imlement in inlabru

      1. Use the implemented Type0 likelihoods
      • Probability of zero is constant
      1. Use the “two-likelihood” approach
      • Probability of zero can be modelled